HMM-based MAP Prediction of Formant Frequencies from N
Abstract
This paper describes how formant frequencies of voiced and unvoiced speech can be predicted from mel-frequency cepstral coefficient (MFCC) vectors using maximum a posteriori (MAP) estimation within a hidden Markov model (HMM) framework. Gaussian mixture models (GMMs) are used to model the local joint density of MFCCs and formant frequencies. More localised prediction is achieved by modelling speech with separate voiced, unvoiced and non-speech GMMs for every state of each model in a set of HMMs. To predict formant frequencies from an MFCC vector, the speech class (voiced, unvoiced or non-speech) is predicted first. For vectors classed as voiced or unvoiced, formant frequencies are then predicted by MAP estimation using the state-specific GMMs. This 'HMM-GMM' prediction of speech class and formant frequencies was evaluated on a male, 5000-word, unconstrained, large-vocabulary, speaker-independent database.
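The prediction step for a single GMM can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a joint GMM over the stacked vector [MFCC; formants] with full covariances, picks the mixture component with the highest posterior probability given the MFCC vector, and returns that component's conditional mean of the formants (the mode of the conditional Gaussian). The function names `log_gauss` and `map_predict_formants`, the parameter layout, and the choice of a single best component are all assumptions made for illustration. In the HMM-GMM scheme described above, the GMM passed in would be the one belonging to the decoded HMM state and the predicted speech class.

```python
import numpy as np

def log_gauss(x, mean, cov):
    """Log density of a multivariate Gaussian evaluated at x."""
    d = x.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

def map_predict_formants(x, weights, means, covs, n_mfcc):
    """Predict formants from MFCC vector x with a joint GMM over [x; f].

    weights: (K,) mixture weights
    means:   (K, D) joint means, first n_mfcc entries are the MFCC part
    covs:    (K, D, D) joint full covariance matrices
    (hypothetical layout chosen for this sketch)
    """
    # Log posterior (up to a constant) of each component given x only,
    # using the MFCC marginal of each joint Gaussian.
    log_post = np.array([
        np.log(w) + log_gauss(x, m[:n_mfcc], c[:n_mfcc, :n_mfcc])
        for w, m, c in zip(weights, means, covs)
    ])
    k = int(np.argmax(log_post))  # most probable component

    # Conditional mean of f given x for component k:
    # mu_f + S_fx S_xx^{-1} (x - mu_x)
    mu_x, mu_f = means[k][:n_mfcc], means[k][n_mfcc:]
    s_xx = covs[k][:n_mfcc, :n_mfcc]
    s_fx = covs[k][n_mfcc:, :n_mfcc]
    return mu_f + s_fx @ np.linalg.solve(s_xx, x - mu_x)
```

An alternative design weights the conditional means of all components by their posteriors instead of selecting a single component; which variant suits a given task is an empirical question.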
Similar resources
Predicting Formant Frequencies from MFCC Vectors
This work proposes a novel method of predicting formant frequencies from a stream of mel-frequency cepstral coefficients (MFCC) feature vectors. Prediction is based on modelling the joint density of MFCCs and formant frequencies using a Gaussian mixture model (GMM). Using this GMM and an input MFCC vector, two maximum a posteriori (MAP) prediction methods are developed. The first method predict...
Formant frequency prediction from MFCC vectors in noisy environments
This paper proposes a method of predicting the formant frequencies of a frame of speech from its mel-frequency cepstral coefficient (MFCC) representation. Prediction is achieved through the creation of a Gaussian mixture model (GMM) which models the joint density of formant frequencies and MFCCs. Using this GMM and an input MFCC vector, a maximum a posteriori (MAP) prediction of the formant fre...
Formant Prediction from MFCC Vectors
This work proposes a novel method of predicting formant frequencies from a stream of mel-frequency cepstral coefficients (MFCC) feature vectors. Prediction is based on modelling the joint density of MFCC vectors and formant vectors using a Gaussian mixture model (GMM). Using this GMM and an input MFCC vector, two maximum a posteriori (MAP) prediction methods are developed. The first method pred...
Modelling and ranking of differences across formants of British, Australian and American accents
The differences between formants of British, Australian and American English accents are analysed and ranked. An improved formant model based on linear prediction (LP) feature analysis and a two-dimensional (2D) hidden Markov model (HMM) of formants is employed for estimation of the formant frequencies and bandwidths of vowels and diphthongs. Comparative analysis of the formant trajectories, the...
Formant analysis and synthesis using hidden Markov models
This paper describes a unifying framework for both formant tracking and speech synthesis using hidden Markov models (HMMs). The feature vector in the HMM is composed of the first three formant frequencies, their bandwidths and their deltas over time. Speech is synthesized by generating the most likely sequence of feature vectors from an HMM trained with a set of sentences from a given speaker. Hi...